AITopics

Country: Asia > China > Hong Kong (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Feb-12-2026, 02:43:26 GMT

d313b4a8c88eba7f0542c489899cec77-Paper-Conference.pdf

algorithm, international conference, optimization, (13 more...)

Country:

Asia > China > Hong Kong (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Sep-30-2025

H+: An Efficient Similarity-Aware Aggregation for Byzantine Resilient Federated Learning

Zuo, Shiyuan, Fan, Rongfei, Zhan, Cheng, Xu, Jie, Zhao, Puning, Hu, Han

Federated Learning (FL) enables decentralized model training without sharing raw data. However, it remains vulnerable to Byzantine attacks, which can compromise the aggregation of locally updated parameters at the central server. Similarity-aware aggregation has emerged as an effective strategy to mitigate such attacks by identifying and filtering out malicious clients based on similarity between client model parameters and those derived from clean data, i.e., data that is uncorrupted and trustworthy. However, existing methods adopt this strategy only in FL systems with clean data, making them inapplicable to settings where such data is unavailable. In this paper, we propose H+, a novel similarity-aware aggregation approach that not only outperforms existing methods in scenarios with clean data, but also extends applicability to FL systems without any clean data. Specifically, H+ randomly selects $r$-dimensional segments from the $p$-dimensional parameter vectors uploaded to the server and applies a similarity check function $H$ to compare each segment against a reference vector, preserving the most similar client vectors for aggregation. The reference vector is derived either from existing robust algorithms when clean data is unavailable or directly from clean data. Repeating this process $K$ times enables effective identification of honest clients. Moreover, H+ maintains low computational complexity, with an analytical time complexity of $\mathcal{O}(KMr)$, where $M$ is the number of clients and $Kr \ll p$. Comprehensive experiments validate H+ as a state-of-the-art (SOTA) method, demonstrating substantial robustness improvements over existing approaches under varying Byzantine attack ratios and multiple types of traditional Byzantine attacks, across all evaluated scenarios and benchmark datasets.
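The sample-compare-keep loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine similarity standing in for the check function $H$, the contiguous segment choice, the vote counting across the $K$ rounds, and the names `segment_vote_aggregate` and `keep` are all assumptions made for illustration.

```python
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0


def segment_vote_aggregate(updates, reference, r, K, keep, seed=0):
    """Hypothetical sketch: K rounds of similarity checks on random r-dim segments.

    updates   -- list of M client parameter vectors, each of length p
    reference -- reference vector of length p (from clean data or a robust
                 aggregator, as the abstract describes)
    keep      -- number of clients retained (an assumed parameter)
    """
    rng = random.Random(seed)
    p, M = len(reference), len(updates)
    votes = [0] * M
    for _ in range(K):
        # Assumption: a segment is a contiguous slice starting at a random index.
        start = rng.randrange(p - r + 1)
        ref_seg = reference[start:start + r]
        scores = [cosine(u[start:start + r], ref_seg) for u in updates]
        # Clients whose segment is most similar to the reference get a vote.
        top = sorted(range(M), key=lambda i: scores[i], reverse=True)[:keep]
        for i in top:
            votes[i] += 1
    # Keep the clients with the most votes and average their full vectors.
    selected = sorted(sorted(range(M), key=lambda i: votes[i], reverse=True)[:keep])
    aggregate = [sum(updates[i][j] for i in selected) / len(selected) for j in range(p)]
    return selected, aggregate
```

In this sketch the similarity checks read only $K \cdot M \cdot r$ coordinates in total, which is the quantity behind the $\mathcal{O}(KMr)$ figure quoted in the abstract; the final averaging step is ordinary mean aggregation over the retained clients.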

artificial intelligence, clean data, machine learning, (16 more...)

2509.2433

Country: Asia > China (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Sadananda, Arjun, Banavar, Ravi, Arya, Kavi

Robust Orientation Estimation with TRIAD-aided Manifold EKF

arXiv.org Artificial Intelligence, Sep-30-2025

The manifold extended Kalman filter (Manifold EKF) has found extensive application in attitude determination. Magnetometers used as sensors for attitude determination are prone to disturbances because of their sensitivity to calibration and external magnetic fields. The TRIAD (Tri-Axial Attitude Determination) algorithm is well known as a sub-optimal attitude estimator. In this article, we exploit this sub-optimal feature of TRIAD to mitigate the influence of the magnetometer reading on pitch and roll determination in the Manifold EKF algorithm. We substantiate our results with experiments. Accurate orientation estimation is critical for a wide range of applications, such as Unmanned Aerial Vehicles (UAVs), mobile devices, and robotics. Numerous studies have been dedicated to improving sensor orientation estimation.
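For readers unfamiliar with TRIAD, the classical two-vector construction is sketched below. This is the textbook algorithm only, not the TRIAD-aided Manifold EKF proposed in the paper; the function names are hypothetical. In a typical inertial setup the first vector pair would come from the accelerometer (gravity) and the second from the magnetometer, but that pairing is an assumption here, not a detail stated in the abstract.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    """Normalize a 3-vector; raises on a zero-length input."""
    n = math.sqrt(sum(x * x for x in a))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / n for x in a)


def triad(b1, b2, r1, r2):
    """Classical TRIAD: rotation matrix A such that A @ r1 is parallel to b1.

    b1, b2 -- two non-parallel observations in the body frame
    r1, r2 -- the same two directions in the reference frame
    The first pair is matched exactly; the second pair only fixes the
    rotation about the first vector.
    """
    t1 = unit(b1)
    t2 = unit(cross(b1, b2))
    t3 = cross(t1, t2)
    s1 = unit(r1)
    s2 = unit(cross(r1, r2))
    s3 = cross(s1, s2)
    body, ref = (t1, t2, t3), (s1, s2, s3)
    # A = sum_k t_k s_k^T, i.e. (body triad) times (reference triad) transposed.
    return [[sum(t[i] * s[j] for t, s in zip(body, ref)) for j in range(3)]
            for i in range(3)]
```

Because `b2` enters only through `cross(b1, b2)`, any component of the second measurement that lies along the first is discarded. That asymmetry is what makes TRIAD sub-optimal, and it is also why an error in the second sensor cannot tilt the estimate of the first direction.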

artificial intelligence, estimator, orientation, (16 more...)

2509.23456

Country: Asia > India > Maharashtra (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)
Information Technology > Artificial Intelligence > Robots (0.34)

arXiv.org Artificial Intelligence, Sep-18-2025

Controllable Pareto Trade-off between Fairness and Accuracy

Du, Yongkang, Zhao, Jieyu, Yang, Yijun, Zhou, Tianyi

The fairness-accuracy trade-off is a key challenge in NLP tasks. Current work focuses on finding a single "optimal" solution to balance the two objectives, which is limiting given the diverse solutions on the Pareto front. This work aims to provide controllable trade-offs according to the user's preference between the two objectives, which is defined as a reference vector. To achieve this goal, we apply multi-objective optimization (MOO), which can find solutions from various regions of the Pareto front. However, it is challenging to precisely control the trade-off due to the stochasticity of the training process and the high-dimensional gradient vectors. Thus, we propose Controllable Pareto Trade-off (CPT), which can effectively train models to perform different trade-offs according to users' preferences. CPT 1) stabilizes the fairness update with a moving average of stochastic gradients to determine the update direction, and 2) prunes the gradients by only keeping the gradients of the critical parameters. We evaluate CPT on hate speech detection and occupation classification tasks. Experiments show that CPT can achieve a higher-quality set of solutions on the Pareto front than the baseline methods. It also exhibits better controllability and can precisely follow the human-defined reference vectors.
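The two stabilizing steps named in this abstract, a moving average of stochastic gradients and gradient pruning, can be sketched in a few lines. This is an illustrative sketch rather than the CPT implementation: the exponential form of the moving average, the decay `beta`, and the use of gradient magnitude to decide which parameters are "critical" are all assumptions, as are the function names.

```python
def ema_update(avg, grad, beta=0.9):
    """Exponential moving average of gradients (assumed form of the moving average)."""
    return [beta * a + (1.0 - beta) * g for a, g in zip(avg, grad)]


def prune_topk(grad, k):
    """Keep the k largest-magnitude entries and zero the rest.

    Magnitude is a stand-in criterion for 'critical parameters'; the abstract
    does not say how criticality is measured.
    """
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def stabilized_direction(avg, grad, k, beta=0.9):
    """Smooth the new stochastic gradient into the average, then prune it."""
    avg = ema_update(avg, grad, beta)
    return avg, prune_topk(avg, k)
```

A training loop would call `stabilized_direction` once per step, carrying `avg` forward and using the pruned vector as the update direction for the fairness objective.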

evolutionary algorithm, machine learning, natural language, (19 more...)

2509.13651

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.93)
(2 more...)

Neural Information Processing Systems, Aug-19-2025, 04:54:26 GMT

d313b4a8c88eba7f0542c489899cec77-Supplemental-Conference.pdf

artificial intelligence, evolutionary algorithm, machine learning, (15 more...)

Country:

Asia > China > Hong Kong (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Neural Information Processing Systems, Aug-19-2025, 04:54:22 GMT

d313b4a8c88eba7f0542c489899cec77-Paper-Conference.pdf

artificial intelligence, evolutionary algorithm, machine learning, (13 more...)

Country:

Asia > China > Hong Kong (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

arXiv.org Artificial Intelligence, Jun-19-2025

Generalized Reference Kernel With Negative Samples For Support Vector One-class Classification

Raitoharju, Jenni

This paper focuses on small-scale one-class classification with some negative samples available. We propose Generalized Reference Kernel with Negative Samples (GRKneg) for One-class Support Vector Machine (OC-SVM). We study different ways to select/generate the reference vectors and recommend an approach for the problem at hand. It is worth noting that the proposed method does not use any labels in the model optimization but uses the original OC-SVM implementation. Only the kernel used in the process is improved using the negative data. We compare our method with the standard OC-SVM and with the binary Support Vector Machine (SVM) using different amounts of negative samples. Our approach consistently outperforms the standard OC-SVM using Radial Basis Function kernel. When there are plenty of negative samples, the binary SVM outperforms the one-class approaches as expected, but we show that for the lowest numbers of negative samples the proposed approach clearly outperforms the binary SVM.

artificial intelligence, machine learning, negative sample, (13 more...)

2506.14895

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

arXiv.org Artificial Intelligence, Mar-24-2024

Multi-Task Learning with Multi-Task Optimization

Bai, Lu, Gupta, Abhishek, Ong, Yew-Soon

Multi-task learning solves multiple correlated tasks. However, conflicts may exist between them. In such circumstances, a single solution can rarely optimize all the tasks, leading to performance trade-offs. To arrive at a set of optimized yet well-distributed models that collectively embody different trade-offs in one algorithmic pass, this paper proposes to view Pareto multi-task learning through the lens of multi-task optimization. Multi-task learning is first cast as a multi-objective optimization problem, which is then decomposed into a diverse set of unconstrained scalar-valued subproblems. These subproblems are solved jointly using a novel multi-task gradient descent method, whose uniqueness lies in the iterative transfer of model parameters among the subproblems during the course of optimization. A theorem proving faster convergence through the inclusion of such transfers is presented. We investigate the proposed multi-task learning with multi-task optimization for solving various problem settings including image classification, scene understanding, and multi-target regression. Comprehensive experiments confirm that the proposed method significantly advances the state-of-the-art in discovering sets of Pareto-optimized models. Notably, on the large image dataset we tested on, namely NYUv2, the hypervolume convergence achieved by our method was found to be nearly two times faster than the next-best among the state-of-the-art.
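The decomposition step in this abstract, turning one multi-objective problem into a set of scalar subproblems, can be illustrated on a toy example. The sketch below assumes a weighted-sum scalarization and two hypothetical one-dimensional objectives, $f_1(x) = (x-1)^2$ and $f_2(x) = (x+1)^2$; the abstract does not specify the scalarization, and the iterative parameter transfer between subproblems, which is the paper's stated novelty, is deliberately omitted.

```python
def solve_subproblems(weights, steps=200, lr=0.1):
    """Plain gradient descent on g_w(x) = w*(x-1)^2 + (1-w)*(x+1)^2 for each w.

    Each weight w in [0, 1] defines one unconstrained scalar subproblem.
    Subproblems are solved independently here; no parameters are shared
    or transferred between them.
    """
    solutions = []
    for w in weights:
        x = 0.0
        for _ in range(steps):
            grad = 2.0 * w * (x - 1.0) + 2.0 * (1.0 - w) * (x + 1.0)
            x -= lr * grad
        solutions.append(x)
    return solutions
```

Setting the gradient to zero gives the minimizer $x^* = 2w - 1$, so sweeping `w` from 0 to 1 traces solutions from the optimum of $f_2$ to the optimum of $f_1$, which for this toy problem is the whole Pareto set.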

algorithm, optimization, subproblem, (15 more...)

2403.16162

Country:

Asia > Singapore (0.04)
Asia > China > Fujian Province > Xiamen (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Tuan, Tran Anh, Dung, Nguyen Viet, Thang, Tran Ngoc

A Hyper-Transformer model for Controllable Pareto Front Learning with Split Feasibility Constraints

arXiv.org Artificial Intelligence, Feb-4-2024

Controllable Pareto front learning (CPFL) approximates the Pareto solution set and then locates a Pareto optimal solution with respect to a given reference vector. In practice, however, the decision-maker's objectives are limited to a constraint region, so instead of training on the entire decision space, we train only on the constraint region. Controllable Pareto front learning with Split Feasibility Constraints (SFC) is a way to find the best Pareto solutions to a split multi-objective optimization problem that meets certain constraints. In a previous study, CPFL used a Hypernetwork model comprising multi-layer perceptron (Hyper-MLP) blocks. With the substantial advancement of the transformer architecture in deep learning, transformers can outperform other architectures in various tasks. Therefore, we have developed a hyper-transformer (Hyper-Trans) model for CPFL with SFC. We use the theory of universal approximation for sequence-to-sequence functions to show that the Hyper-Trans model yields smaller MED errors in computational experiments than the Hyper-MLP model.

controllable pareto front learning, hyper-transformer model, optimal solution, (10 more...)